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1 Introduction 



These lecture notes provide an introduction to classical and quantum Monte 
Carlo simulations for magnetic models, with an emphasis on non-local up- 
dates, as well as an introduction to exact diagonalization. These notes are 
based in part on a course on algorithms for many-body problems taught at 
ETH Zurich from 1999-2004. For details of these methods and for references 
to the original literature I refer to the references in the review that will also 
be handed out. 

2 Monte Carlo integration 

In thermodynamics, as in many other fields of physics, often very high dimen- 
sional integrals have to be evaluated. Even in a classical iV-body simulation 
the phase space has dimension QN, as there are three coordinates each for 
the location and position of each particle. In a quantum mechanical problem 
of N particles the phase space is even exponentially large as a function of N. 

2.1 Standard integration methods 

A Riemannian integral f(x) over an interval [a, b] can be evaluated by re- 
placing it by a finite sum: 



/ /(x)rfx = ^/(a + iAa;)Aa; + 0(Ax 2 ), 
Ja i=i 



where Ax = (a — b)/N. The discretization error decreases as 1/N for this 
simple formula. Better approximations are the trapezoidal rule 



rb 

/ f(x)dx = 

Ja 



Ax 



1 N ~ x I 

z i=l z 



+ 0(Ax 2 ), (2) 



or the Simpson rule 



rb 

/ f(x)dx 

J a 



Ax 



N/2 



/(a) + ^4/(a + (2i-l)Ax) 
i=i 

N/2-1 

+ E 2f(a + 2iAx) +f(b) +0(Ax 4 
i=i 



(3) 



which scales like iV . 
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For more elaborate schemes like the Romberg method or Gaussian inte- 
gration we refer to textbooks. 

In higher dimensions the convergence is much slower though. With iV 
points in d dimensions the linear distance between two points scales only 
as N~ l l d . Thus the Simpson rule in d dimensions converges only as N" 4 ^ d , 
which is very slow for large d. The solution are Monte Carlo integrators. 



2.2 Monte Carlo integrators 

With randomly chosen points the convergence does not depend on dimension- 
ality. Using N randomly chosen points Xj the integral can be approximated 
by 

1 r 1 N 

^//(x)dx« /:=-£/(:*), (4) 

i=i 

where Q := J <ix is the integration volume. As we saw in the previous chapter 
the errors of such a Monte Carlo estimate the errors scale as N^ 1 ^ 2 . In d > 9 
dimensions Monte Carlo methods are thus preferable to a Simpson rule. 



2.2.1 Importance Sampling 

This simple Monte Carlo integration is however not the ideal method. The 
reason is the variance of the function 



Var/ = IT 1 J /(x) 2 dx - ft" 1 J /(x)dx 



N -2 

' (P-f). (5) 



N- 1 



The error of the Monte Carlo simulation is 

A = 



/Var/ 



N 



P-f 



(6) 



In phase space integrals the function is often strongly peaked in a small 
region of phase space and has a large variance. The solution to this problem 
is "importance sampling" , where the points Xj are chosen not uniformly but 
according to a probability distribution p(x) with 

y P (x)dx=i. (7) 

Using these p-distributed random points the sampling is done according 



to 
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and the error is 



A = ff? (9) 

It is ideal to choose the distribution function p as similar to / as possible. 
Then the ratio f/p is nearly constant and the variance small. 

As an example, the function f(x) = exp(— x 2 ) is much better integrated 
using exponentially distributed random numbers with p{x) = exp(— Xx) in- 
stead of uniformly distributed random numbers. 

A natural choice for the weighting function p is often given in the case 
of phase space integrals or sums, where an observable A is averaged over all 
configurations x in phase space where the probability of a configuration is 
p(x). The phase space average (A) is: 

2.3 Markov chains and the Metropolis algorithm 

In general problems with arbitrary distributions p it will not be possible to 
create ^-distributed configuration from scratch. Instead a Markov process 
can be used. 

Starting from an initial point x a Markov chain of states is generated: 

x -> Xi -> x 2 -> . . . -> X„ -> X n+ i -> . . . (11) 

A transition matrix W xy gives the transition probabilities of going from state 
x to state y in one step of the Markov process. As the sum of probabilities 
of going from state x to any other state is one, the columns of the matrix W 
are normalized: 

E^xy = l (12) 

y 

A consequence is that the Markov process conserves the total probability. 
Another consequence is that the largest eigenvalue of the transition matrix 
W is 1 and the corresponding eigenvector with only positive entries is the 
equilibrium distribution which is reached after a large number of Markov 
steps. 

We want to determine the transition matrix W so that we asymptotically 
reach the desired probability p x for a configuration i. A set of sufficient 
conditions is: 

1. Ergodicity: It has to be possible to reach any configuration x from 
any other configuration y in a finite number of Markov steps. This 
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(15) 



means that for all x and y there exists a positive integer n < oo such 
that (W n ) xy ^ 0. 

2. Detailed balance: The probability distribution p^ changes at each 
step of the Markov process: 

E^ n) ^x y = pSr +1) . (13) 

x 

but converges to the equilibrium distribution p x . This equilibrium dis- 
tribution p x is an eigenvector with left eigenvalue 1 and the equilibrium 
condition 

E^xy=P y (14) 
x 

must be fulfilled. It is easy to see that the detailed balance condition 

^xy = Py 

is sufficient. 

The simplest Monte Carlo algorithm is the Metropolis algorithm: 

• Starting with a point X; choose randomly one of a fixed number N of 
changes Ax, and propose a new point x' = + Ax. 

• Calculate the ratio os the probabilities P = p x > /p Xi ■ 

• If P > 1 the next point is x i+1 = x' 

• If P < 1 then Xj +1 = x' with probability P, otherwise x i+1 = Xj. We 
do that by drawing a random number r uniformly distributed in the 
interval [0, 1[ and set x i+ i = x' if r < P. 

• Measure the quantity A at the new point Xj+i. 

The algorithm is ergodic if one ensures that the iV possible random 
changes allow all points in the integration domain to be reached in a fi- 
nite number of steps. If additionally for each change Ax there is also an 
inverse change —Ax we also fulfill detailed balance: 

Wij = ^min(l,p(j)/p(i)) = p(j) 
Wji ^min(l,p(i)/p(j)) p(i)' 

As an example let us consider summation over integers i. We choose iV = 
2 possible changes Ai=±l and fulfill both ergodicity and detailed balance as 
long as p{i) is nonzero only over a finite contiguous subset of the integers. 
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To integrate a one-dimensional function we take the limit N — > oo and 
pick any change 5 G [—A, A] with equal probability. The detailed balance 
equation (16) is only modified in minor ways: 



Wij _ £mm(l,p(j)/p(i)) _ p(j) 



(17) 

Wji £mm(l,p(i)/p(j)) p(i)' 

Again, as long as p(x) is nonzero only on a finite interval detailed balance 
and ergodicity are fulfilled. 

2.4 Autocorrelations, equilibration and Monte Carlo 
error estimates 

2.4.1 Autocorrelation effects 

In the determination of statistical errors of the Monte Carlo estimates we 
have to take into account correlations between successive points Xj in the 
Markov chain. These correlations between configurations manifest them- 
selves in correlations between the measurements of a quantity A measured in 
the Monte Carlo process. Denote by A(t) the measurement of the observable 
A evaluated at the t-th Monte Carlo point x t . The autocorrelations decay 
exponentially for large time differences A: 

(A t A t+A ) - (A) 2 oc exp(-A/rf p) ) (18) 

Note that the autocorrelation time ta depends on the quantity A. 

An alternative definition is the integrated autocorrelation time r A mt \ de- 
fined by 

.W_ Eti({M + A)-(A) 2 ) , . 

" (A 2 ) - (A) 2 { ' 

As usual the expectation value of the quantity A can be estimated by the 
mean: 

i 

The error estimate 



. , /VarA 

AA =i— (21) 

has to be modified because consecutive measurements are correlated . The 
error estimate (A A) 2 is calculates as the expectation value of the squared 
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difference between sample average and expectation value: 

{AAf = ((A-(A)y} = (^jtA(t)-(A}' ) j ) 
1 N 



i=l 
9 N N-t 

+^EE((« + a))-m> ; 

JV t=l A=l 

lvarA(f + 2rr ) ) 

1 ~ T 2 \M _L O^M) 



V l (^-^)(l + 2rr0 (22) 

In going from the second to third line we assumed r^™^ <C iV and extended 
the summation over A to infinity. In the last line we replaced the variance by 
an estimate obtained from the sample. We see that the number of statistical 
uncorrelated samples is reduced from iV to N/(l + 2r^ m *' ) ). 

In many Monte Carlo simulations the error analysis is unfortunately not 
done accurately. Thus we wish to discuss this topic here in more detail. 

2.4.2 The binning analysis 

The binning analysis is a reliable way to estimate the integrated autocor- 
relation times. Starting from the original series of measurements Af^ with 
% = 1, . . . , N we iteratively create "binned" series by averaging over to con- 
secutive entries: 

A? := \ + , i = 1, • • • , Ni = N/2 1 . (23) 

These bin averages Af^ are less correlated than the original values A^ . The 
mean value is still the same. 

The errors AA^ l \ estimated incorrectly using equation (21) 



however increase as a function of bin size 2 l . For 2 l 3> r^™^ the bins become 
uncorrelated and the errors converge to the correct error estimate: 

AA = lim AA {1) . (25) 
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This binning analysis gives a reliable recipe for estimating errors and 
autocorrelation times. One has to calculate the error estimates for different 
bin sizes I and check if they converge to a limiting value. If convergence is 
observed the limit AA is a reliable error estimate, and r^™^ can be obtained 
from equation (22) as 



If however no convergence of the AA® is observed we know that T4 
is longer than the simulation time and we have to perform much longer 
simulations to obtain reliable error estimates. 

To be really sure about convergence and autocorrelations it is very im- 
portant to start simulations always on tiny systems and check convergence 
carefully before simulating larger systems. 

This binning analysis is implemented in the ALPS library which will be 
used in the hands-on session. 

2.4.3 Jackknife analysis 

The binning procedure is a straightforward way to determine errors and 
autocorrelation times for Monte Carlo measurements. For functions of mea- 
surements like U = (A)/ (B) it becomes difficult because of error propagation 
and cross-correlations. 

Then the jackknife procedure can be used. We again split the measure- 
ments into M bins of size N/M r^ nt ^ that should be much larger than 
any of the autocorrelation times. 

We could now evaluate the complex quantity U in each of the M bins 
and obtain an error estimate from the variance of these estimates. As each 
of the bins contains only a rather small number of measurements N/M the 
statistics will not be good. The jackknife procedure instead works with M + l 
evaluations of U. Uq is the estimate using all bins, and Ui for % = 1, . . . M 
is the value when all bins except the i-th bin are used. That way we always 
work with a large data set and obtain good statistics. 

The resulting estimate for U will be: 




(26) 



U = U -(M-1)(U-U ) 



(27) 



with a statistical error 



AU = y/M-1 




(28) 
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where 

i M 

TJ =mT,U 1 , (29) 
i=i 

The jackknife analysis is implemented in the ALPS library which will be 
used in the hands-on session. 

2.4.4 Equilibration 

Thermalization is as important as autocorrelations. The Markov chain con- 
verges only asymptotically to the desired distribution. Consequently, Monte 
Carlo measurements should be started only after a large number N eq of equi- 
libration steps, when the distribution is sufficiently close to the asymptotic 
distribution. N eq has to be much larger than the thermalization time which 
is defined similar to the autocorrelation time as: 

re,) _ gyjAA) - {Af ) 

~ (Ao)(A)-(Ay m 

It can be shown that the thermalization time is the maximum of all au- 
tocorrelation times for all observables and is related to the second largest 
eigenvalue A2 of the Markov transition matrix W by = — I/I11A2. It is 
recommended to thermalize the system for at least ten times the thermaliza- 
tion time before starting measurements. 



3 Classical Monte Carlo simulations 

Before getting to algorithms for quantum systems we first review the corre- 
sponding algorithms for classical systems, starting with the Ising model as 
the simplest model. 

3.1 The Ising model 

The Ising model is the simplest model for a magnetic system and a prototype 
statistical system. We will use it for our discussion of thermodynamic phase 
transitions. It consists of an array of classical spins <7j = ±1 that can point 
either up (a* = +1) or down (<7j = —1). The Hamiltonian is 

H = -J'£a i a j , (31) 

(id) 

where the sum goes over nearest neighbor spin pairs. 
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Two parallel spins contribute an energy of — J while two antiparallel ones 
contribute +J. In the ferromagnetic case the state of lowest energy is the 
fully polarized state where all spins are aligned, either pointing up or down. 

At finite temperatures the spins start to fluctuate and also states of higher 
energy contribute to thermal averages. The average magnetization thus de- 
creases from its full value at zero temperature. At a critical temperature T c 
there is a second order phase transition to a disordered phase. The Ising 
model is the simplest magnetic model exhibiting such a phase transition and 
is often used prototype model for magnetism. 

The thermal average of a quantity A at a finite temperature T is given 
by a sum over all states: 



where (5 = 1/ksT is the inverse temperature. Ai is the value of the quantity 
A in the configuration i. Ei is the energy of that configuration. 
The partition function 



normalizes the probabilities pi = exp(— (3Ei)/Z. 

For small systems it is possible to evaluate these sums exactly. As the 
number of states grows like 2 N a straight-forward summation is possible 
only for very small N. For large higher dimensional systems Monte Carlo 
summation/integration is the method of choice. 

3.2 The single spin flip Metropolis algorithm 

As was discussed in connection with integration it is usually not efficient 
to estimate the average (32) using simple sampling. The optimal method 
is importance sampling, where the states % are not chosen uniformly but 
with the correct probability pi, which we can again do using the Metropolis 
algorithm. 

The simplest Monte Carlo algorithm for the Ising model is the single spin 
flip Metropolis algorithm which defines a Markov chain through phase space. 

• Starting with a configuration q propose to flip a single spin, leading to 
a new configuration d. 

• Calculate the energy difference AE = E[d] — E[ci] between the config- 
urations d and q. 



(A) = -5>iexp(-/3£0, 



(32) 




(33) 



9 



• If AE < the next configuration is c i+1 = d 

• If AE > then Cj+i = d with probability exp(— (3AE), otherwise 
Q + i = Q. We do that by drawing a random number r uniformly dis- 
tributed in the interval [0, 1[ and set q + i = d if r < exp(— (3AE). 

• Measure all the quantities of interest in the new configuration. 

This algorithm is ergodic since any configuration can be reached from 
any other in a finite number of spin flips. It also fulfills the detailed balance 
condition. 

3.3 Systematic errors: boundary and finite size effects 

In addition to statistical errors due to the Monte Carlo sampling our simu- 
lations suffer from systematic errors due to boundary effects and the finite 
size of the system. 

Unless one wants to study finite systems with open boundaries, boundary 
effects can be avoided completely by using periodic boundary conditions. The 
lattice is continued periodically, forming a torus. The left neighbor of the 
leftmost spin is just the rightmost boundary spin, etc.. 

Although we can avoid boundary effects, finite size effects remain since 
now all correlations are periodic with the linear system size as period. Here 
is how we can treat them: 

• Away from phase transitions the correlation length £ is finite and finite 
size effects are negligible if the linear system size L 3> £. Usually 
L > 6£ is sufficient, but this should be checked for each simulation. 

• In the vicinity of continuous phase transitions we encounter a problem: 
the correlation length £ diverges. Finite size scaling can comes to the 
rescue and can be used to obtain the critical behavior. A detailed 
discussion of finite size scaling is beyond the scope of these notes. 

3.4 Critical behavior of the Ising model 

Close to the phase transition at T c again scaling laws characterize the behav- 
ior of all physical quantities. The average magnetization scales as 

m(T) = (\M\/V)K(T c -Tf, (34) 

where M is the total magnetization and V the system volume (number of 
spins). 
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The magnetic susceptibility x — ^k=o can be calculated from magneti- 
zation fluctuations and diverges with the exponent 7: 

x(TH Mf lK _ r (35 ) 

The correlation length £ is defined by the asymptotically exponential 
decay of the two-spin correlations: 

(oWr) - (H) 2 « exp(-r/£). (36) 

It is best calculated from the structure factor S(q) , defined as the Fourier 
transform of the correlation function. For small q the structure factor has a 
Lorentzian shape: 

^) = rr^ + °(? 4 )- (37) 

The correlation length diverges as 

£(p) oc |T - T c r . (38) 
At the critical point the correlation function itself follows a power law: 

(a a r ) oc r - {d - 2+ri) (39) 

where rj = 2(3/u — d + 2. 

The specific heat C(T) diverges logarithmically in two dimensions: 

C(T) oc \n\T-T c \ oc \T-T c \~ a (40) 

and the critical exponent a = 0. 

A good estimate of T c is obtained from the Binder cumulant 

U = l-^L (41) 

which has a universal value at p c , also the Binder cumulant has a universal 
value at T c . The curves of U(T) for different system sizes L all cross in one 
point at T c . This is a consequence of the finite size scaling ansatz: 

(M 4 ) = (T — T c ) 4/3 -u 4 ((T — T c )L l l v ) 

(M 2 ) = (T -T C ) 2 %((T -T c )L l / v ). (42) 

Thus 

U(TL)-l "4((r-T c )LV") 
11 



which for T = T c is universal and independent of system size L: 



U(T c ,L) = l 



«4(0) 



(44) 



3m 2 (0) 2 



High precision Monte Carlo simulations actually show that not all lines 
cross exactly at the same point, but that due to higher order corrections to 
finite size scaling the crossing point moves slightly, proportional to L^ 1 ^, al- 
lowing a high precision estimate of T c and v. For details of the determination 
of critical points and exponents see e.g. Ref [1, 2]. 

3.5 "Critical slowing down" and cluster Monte Carlo 
methods 

The importance of autocorrelation becomes clear when we wish to simulate 
the Ising model at low temperatures. The mean magnetization (to) is zero 
on any finite cluster, as there is a degeneracy between a configuration and its 
spin reversed counterpart. If, however, we start at low temperatures with a 
configuration with all spins aligned up it will take extremely long time for all 
spins to be flipped by the single spin flip algorithm. This problem appears 
as soon as we get close to the critical temperature, where it was observed 
that the autocorrelation times diverge as 



with a dynamical critical exponents z ~ 2 for all local update methods like 
the single spin flip algorithm. 

The reason is that at low temperatures it is very unlikely that even one 
spin gets flipped, and even more unlikely for a large cluster of spins to be 
flipped. The solution to this problem in the form of cluster updates was 
found in 1987 and 1989 by Swendsen and Wang [3] and by Wolff [4]. Instead 
of flipping single spins they propose to flip big clusters of spins and choose 
them in a clever way so that the probability of flipping these clusters is large. 

3.5.1 Cluster updates 

We use the Fortuin-Kastelyn representation of the Ising model, as generalized 
by Kandel and Domany. The phase space of the Ising model is enlarged by 
assigning a set Q of possible "graphs" to each configuration C in the set of 
configurations C. We write the partition function as 



r oc [min(£,L)] z . 



(45) 



z = E E w(c,G) 



(46) 



cgc Geg 
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where the new weights W(C, G) > are chosen such that Z is the partition 
function of the original model by requiring 

J2 W(C, G) = W(C) := eM-PE[C\), (47) 

Geg 

where E[C] is the energy of the configuration C. 

The algorithm now proceeds as follows. First we assign a graph G G Q 
to the configuration C, chosen with the correct probability 

P C (G)=W(C,G)/W(C). (48) 

Then we choose a new configuration C with probability p[(C, G) — > (C, G)], 
keeping the graph G fixed; next a new graph G' is chosen 

C -> (G, G) -> (C, G) -> C" -> (C, G') -> . . . (49) 

What about detailed balance? The procedure for choosing graphs with 
probabilities Pq obeys detailed balance trivially. The non-trivial part is 
the probability of choosing a new configuration C. There detailed balance 
requires: 

W(C, G)p[(C, G) - (C, G)] = W(C, G)p[(C, G) - (C, G)], (50) 
which can be fulfilled using either the heat bath algorithm 

Pl(C,G)^(C,G)^ w{c ^l c , G) (51) 

or by again using the Metropolis algorithm: 

p[(C, G) - (G', G)] = max(Vy(G', G)/W(G, G), 1) (52) 

The algorithm simplifies a lot if we can find a graph mapping such that 
the graph weights do not depend on the configuration whenever it is nonzero 
in that configuration. This means, we want the graph weights to be 

W(C, G) — A(G, G)V(G), (53) 

where 

Then equation (51) simply becomes p — 1/2 and equation (52) reduces to 
p = 1 for any configuration C with W(C, G) ^ 0. 
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Table 1: Local bond weights for the Kandel-Domany representation of the 
Ising model. 





c=TT c=|T c=U c=U 


V(g) 


A(c, discon.) 
A(c, con.) 


1111 

10 1 


e -DJ 

e /3J _ £ -pj 


w(c) 


exp(pj) exp(— f3J) exp(—/3J) exp(/5J) 





3.5.2 The cluster algorithms for the Ising model 

Let us now show how this abstract and general algorithm can be applied to 
the Ising model. Our graphs will be bond-percolation graphs on the lattice. 
Spins pointing into the same direction can be connected or disconnected. 
Spins pointing in opposite directions will always be disconnected. In the 
Ising model we can write the weights W(C) and W(C, G) as products over 
all bonds b: 

W(C) = Uw(C b ) (55) 

b 

W(C,G) = Uw(C b ,G b ) = HA(C b ,G b )V(G b ) (56) 
b b 

where the local bond configurations C b can be one of {Tt?lt\tl)ll} 

and the local graphs can be "connected" or "disconnected". The graph 
selection can thus be done locally on each bond. 

Table 1 shows the local bond weights w(c,g), w(c), A(c,g) and V(g). It 
can easily be checked that the sum rule (47) is satisfied. 

The probability of a connected bond is [exp(/3J) — exp(— (3 J)]/ exp(/5J) = 
1 — exp(— 2/3 J) if two spins are aligned and zero otherwise. These connected 
bonds group the spins into clusters of aligned spins. 

A new configuration C with the same graph G can differ from C only by 
flipping clusters of connected spins. Thus the name "cluster algorithms" . The 
clusters can be flipped independently, as the flipping probabilities p[(C, G) — > 
(C, G)] are configuration independent constants. 

There are two variants of cluster algorithms that can be constructed using 
the rules derived above. 

3.5.3 The Swendsen-Wang algorithm 

The Swendsen-Wang or multi-cluster algorithm proceeds as follows: 
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i) Each bond in the lattice is assigned a label "connected" or "discon- 
nected" according to above rules. Two aligned spins are connected 
with probability 1 — exp(— 2/3 J). Two antiparallel spins are never con- 
nected. 

ii) Next a cluster labeling algorithm, like the Hoshen-Kopelman algorithm 
is used to identify clusters of connected spins. 

hi) Measurements are performed, using improved estimators discussed in 
the next section. 

iv) Each cluster of spins is flipped with probability 1/2. 
3.5.4 The Wolff algorithm 

The Swendsen Wang algorithm gets less efficient in dimensions higher than 
two as the majority of the clusters will be very small ones, and only a few 
large clusters exist. The Wolff algorithm is similar to the Swendsen- Wang 
algorithm but builds only one cluster starting from a randomly chosen point. 
As the probability of this point being on a cluster of size s is proportional 
to s the Wolff algorithm builds preferebly larger clusters. It works in the 
following way: 

i) Choose a random spin as the initial cluster. 

ii) If a neighboring spin is parallel to the initial spin it will be added to 
the cluster with probability 1 — exp(— 2/3 J). 

iii) Repeat step ii) for all points newly added to the cluster and repeat this 
procedure until no new points can be added. 

iv) Perform measurements using improved estimators. 

v) Flip all spins in the cluster. 

We will see in the next section that the linear cluster size diverges with the 
correlation length £ and that the average number of spins in a cluster is just 
XT. Thus the algorithm adapts optimally to the physics of the system and 
the dynamical exponent z ~ 0, thus solving the problem of critical slowing 
down. Close to criticality these algorithms are many orders of magnitudes 
(a factor L 2 ) better than the local update methods. 
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3.6 Improved Estimators 

In this section we present a neat trick that can be used in conjunction with 
cluster algorithms to reduce the variance, and thus the statistical error of 
Monte Carlo measurements. Not only do these "improved estimators" reduce 
the variance. They are also much easier to calculate than the usual "simple 
estimators" . 

To derive them we consider the Swendsen-Wang algorithm. This algo- 
rithm divides the lattice into N c clusters, where all spins within a cluster are 
aligned. The next possible configuration is any of the 2 N ° configurations that 
can be reached by flipping any subset of the clusters. The idea behind the 
"improved estimators" is to measure not only in the new configuration but 
in all equally probable 2 Nc configurations. 

As simplest example we consider the average magnetization (m) . We can 
measure it as the expectation value (o^) of a single spin. As the cluster to 
which the spin belongs can be freely flipped, and the flipped cluster has the 
same probability as the original one, the improved estimator is 

W = (\&- <*)) = <>■ (57) 

This result is obvious because of symmetry, but we saw that at low temper- 
atures a single spin flip algorithm will fail to give this correct result since 
it takes an enormous time to flip all spins. Thus it is encouraging that the 
cluster algorithms automatically give the exact result in this case. 
Correlation functions are not much harder to measure: 

/ v _ ( 1 if i und j are on the same cluster 
J [0 otherwise 

To derive this result consider the two cases and write down the improved 
estimators by considering all possible cluster flips. 

Using this simple result for the correlation functions the mean square of 
the magnetization is 

(m 2 ) = ^ E(^j) = ±( E S{clusterf), (59) 

ij cluster 

where S(cluster) is the number of spins in a cluster. The susceptibility above 
T c is simply given by f3(m 2 ) and can also easily be calculated by above sum 
over the squares of the cluster sizes. 

In the Wolff algorithm only a single cluster is built. Above sum (59) can 
be rewritten to be useful also in case of the Wolff algorithm: 

(m 2 ) = ±{ E S(clusterf) 

cluster 



16 



i 

— Ts- 

N 2 - i N 



4 ^(cluster)), (60) 



where is the size of the cluster containing the initial site i. The expectation 
value for m 2 is thus simply the mean cluster size. In this derivation we 
replaced the sum over all clusters by a sum over all sites and had to divide 
the contribution of each cluster by the number of sites in the cluster. Next 
we can replace the average over all lattice sites by the expectation value for 
the cluster on a randomly chosen site, which in the Wolff algorithm will be 
just the one Wolff cluster we build. 

Generalizations to other quantities, like the structure factor S(q) are 
straightforward. While the calculation of S(q) by Fourier transform needs at 
least 0(N\nN) steps, it can be done much faster using improved estimators, 
here derived for the Wolff algorithm: 

(S(q)) = jpY, a ^ exp(^(f-f)) 



r,r' 
1 



NS(cluster) - ^ t ™ eX ^ f " ^ 

V / r,r'£cluster 



NS(cluster) 



^ exp(igr) 

f '^cluster 



2 



(61) 



This needs only 0(S (cluster)) operations and can be measured directly when 
constructing the cluster. 

Care must be taken for higher order correlation functions. Improved 
estimators for quantities like m 4 which need at least two clusters and cannot 
be measured in an improved way using the Wolff algorithm. 



3.7 Generalizations of cluster algorithms 

Cluster algorithms can be used not only for the Ising model but for a large 
class of classical, and even quantum spin models. The quantum version is 
the "loop algorithm", which will be discussed later in the course. In this 
section we discuss generalizations to other classical spin models. 

Before discussing specific models we remark that generalizations to mod- 
els with different coupling constants on different bonds, or even random cou- 
plings are straightforward. All decisions are done locally, individually for 
each spin or bond, and the couplings can thus be different at each bond. 
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3.7.1 Potts models 



g-state Potts models are the generalization of the Ising model to more than 
two states. The Hamilton function is 

# = -j£<5 w (62) 

(id) 

where the states can take any integer value in the range 1, . . . , q. The 
2-state Potts model is just the Ising model with some trivial rescaling. 

The cluster algorithms for the Potts models connect spins with probability 
1 - e~ pj if the spins have the same value. The clusters are then "flipped" to 
any arbitrarily chosen value in the range 1, . . . , q. 

3.7.2 O(N) models 

Another, even more important generalization are the O(N) models. Well 
known examples are the XF-model with N = 2 and the Heisenberg model 
with N = 3. In contrast to the Ising model the spins can point into any 
arbitrary direction on the iV-sphere. The spins in the XY model can point 
into any direction in the plane and can be characterized by a phase. The 
spins in the Heisenberg model point into any direction on a sphere. 
The Hamilton function is: 

H = -J^S i S j , (63) 

— * 

where the states Si are SO(N) vectors. 

Cluster algorithms are constructed by projecting all spins onto a random 
direction e G SO(N). The cluster algorithm for the Ising model can then be 
used for this projection. Two spins Si and Sj are connected with probability 

1 - exp (min[0, -2(3 J{e ■ Si)(e ■ Sj)]) . (64) 

The spins are flipped by inverting the projection onto the e-direction: 

§. _> ^ - 2(e • Si)e. (65) 

In the next update step a new direction e is chosen. 

4 The quantum Monte Carlo loop algorithm 

The "loop-algorithm" is a generalization of the classical Swendsen-Wang clus- 
ter algorithms to quantum lattice models. I will discuss it here in a path- 
integral representation. A slightly modified version will be discussed later in 
the context of the SSE representation. 
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4.1 Path-integral representation in terms of world lines 

I will discuss the loop algorithm for a spin-1/2 quantum XX Z model with 
the Hamiltonian 

H = -Y,(j z S*S* + J xy (S?S* + SfS])) 

= - E (JzS?S? + ^f{SfS- + S~S+)) . (66) 

For J = J z = J xy we have the Heisenberg model (J > is ferromagnetic, 
J < antiferromagnetic). J xy = is the (classical) Ising model and J z = 
the quantum XY model. 

In the quantum Monte Carlo simulation we want to evaluate thermody- 
namic averages such as 

. 4 . TrAe-P H 

{A) = ~r^r- (67) 

The main problem is the calculation of the exponential e~^ H . The straight- 
forward calculation would require a complete diagonalization, which is just 
what we want to avoid. We thus discretize the imaginary time (inverse tem- 
perature) direction 1 and subdivide (3 = MAr: 

e-? H = (e~^ H ) M = (1 - ArH) M + O(Ar) (68) 

In the limit M — > oo (At — > 0) this becomes exact. We will take the limit 
later, but stay at finite Ar for now. 

The next step is to insert the identity matrix, represented by a sum over 
all basis states 1 = between all operators (1 — AtH): 

Z = Tre-? H = Tr(l- AtH) m + 0(At) 

= E {h\l-ArH\i 2 }(i 2 \l-ArH\i 3 }---(i M \l-ATH\i 1 } + 0(Ar) 

ii,...,iM 

■ Piu-,iM (69) 
and similarly for the measurement, obtaining 

ll,...,l M N 11 1 Z/ 



1 Time evolution in quantum mechanics is e ltH . The Boltzman factor e @ H thus 
corresponds to an evolution in imaginary time t = —i(3 
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Figure 1: Example of a world line configuration for a spin-1/2 quantum 
Heisenberg model. Drawn are the world lines for up-spins only. Down spin 
world lines occupy the rest of the configuration. 

If we choose the basis states \i) to be eigenstates of the local S z operators 
we end up with an Ising-like spin system in one higher dimension. Each choice 
ii, . . . corresponds to one of the possible configurations of this classical 
spin system. The trace is mapped to periodic boundary conditions in the 
imaginary time direction of this classical spin system. The probabilities are 
given by matrix elements (i n \l—ArH\i n+ i). We can now sample this classical 
system using classical Monte Carlo methods. 

However, most of the matrix elements (i n \l — ArH\i n+ i) are zero, and 
thus nearly all configurations have vanishing weight. The only non-zero con- 
figurations are those where neighboring states \i n ) and |i n +i) are either equal 
or differ by one of the off-diagonal matrix elements in H, which are nearest 
neighbor exchanges by two opposite spins. We can thus uniquely connect 
spins on neighboring "time slices" and end up with world lines of the spins, 
sketched in Fig. 1. Instead of sampling over all configurations of local spins 
we thus have to sample only over all world line configurations (the others 
have vanishing weight). Our update moves are not allowed to break world 
lines but have to lead to new valid world line configurations. 

4.2 The loop algorithm 

Until 1993 only local updates were used, which suffered from a slowing down 
like in the classical case. The solution came as a generalization of the cluster 
algorithms to quantum systems [5, 6]. 

This algorithm is best described by first taking the continuous time limit 
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Table 2: The six local configurations for an XX Z model and their weights. 



configuration 


weight 


S,<X + OX)f 
S,<T)f 




S/(x + dt){ 
S,<x) { 


f S/x+cft) 

T s/x) 




S,<x+dx)f 

S,<T)| 


{ S/X+dl) 

} S/x) 


Sj(x + dx){ 
S,<x){ 


| S/x+dx) 
f S/x) 


\- J -fdr 


S,<X+dx){ 

S,<x)f 


| S/x+ax) 
\ S/x) 


S,(x+ax)| 
S,<x) { 

! 


{ S/x+dx) 
| S/x) 


J -^dr 



a)^ ^, b)0— 0, c)^^, d) 

Figure 2: The four local graphs: a) vertical, b) horizontal c) crossing and d) 
freezing (connects all four corners). 




M — » oo (At — > g?t) and by working with infinitesimals. Similar to the Ising 
model we look at two spins on neigboring sites i and j at two neighboring 
times r and r + dr, as sketched in Table 2. There are a total of six possible 
configurations, having three different probabilities. The total probabilities 
are the products of all local probabilities, like in the classical case. This is 
obvious for different time slices. For the same time slice it is also true since, 
denoting by Hij the term in the Hamiltonian H acting on the bond between 
sites i and j we have — drH^) = 1 — drJ2(ij) = 1 — drH. In the 

following we focus only on such local four-spin plaquettes. Next we again 
use the Kandel-Domany framework and assign graphs. As the updates are 
not allowed to break world lines only four graphs, sketched in Fig. 2 are 
allowed. Finally we have to find A functions and graph weights that give 
the correct probabilities. The solution for the XF-model, ferromagnetic and 
antiferromagnetic Heisenberg model and the Ising model is shown in Tables 
3-6. 

Let us first look at the special case of the Ising model. As the exchange 
term is absent in the Ising model all world lines run straight and can be 
replaced by classical spins. The only non-trivial graph is the "freezing", 
connecting two neighboring world lines. Integrating the probability that two 
neighboring sites are nowhere connected along the time direction we obtain: 
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Table 3: The graph weights for the quantum-Xl" model and the A function 
specifying whether the graph is allowed. The dash - denotes a graph that is 
not possible for a configuration because of spin conservation and has to be 
zero. 



G 


t t 

A(t t,G) 
{ 1 
= A(t 

V * / 


t 1 

A(t *,G) 

| t 

= A(f t,G) 


1 t 

A(t *,G) 
= A(I t,G) 

V * / 


graph weight 




1 


1 




1 - ^dr 


0-© 
0-© 




1 


1 






1 




1 


^fdr 


Gy^ 














total weight 


1 


1 







Table 4: The graph weights for the ferromagnetic quantum Heisenberg model 
and the A function specifying whether the graph is allowed. The dash - 
denotes a graph that is not possible for a configuration because of spin con- 
servation and has to be zero. 



G 


t t 

A(l t,G) 

= A(I ♦,<?) 


1 1 

AO *,G) 
I t 
= A(* t,G) 


1 t 

AO *,G) 
1 i 
= A(* t,G) 


graph weight 


11 


1 


1 






0-© 
0-© 















1 




1 


fdr 
















total weight 


1 + ^dr 


1-idr 


^dr 
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Table 5: The graph weights for the antiferro magnetic quantum Heisenberg 
model and the A function specifying whether the graph is allowed. The 
dash - denotes a graph that is not possible for a configuration because of 
spin conservation and has to be zero. To avoid the sign problem (see next 
subsection) we change the sign of J xy , which is allowed only on bipartite 
lattices. 



G 


t t 

A(t t,G) 
i i 

= A(* \,G) 


t 1 

A(t 

* t 

= A(* t,G) 


I t 

A(t 

= A(I t,G) 


graph weight 




1 


1 






0-© 
0-© 




1 


1 































total weight 




1 + 


if* 
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Table 6: The graph weights for the ferromagnetic Ising model and the A 
function specifying whether the graph is allowed. The dash - denotes a 
graph that is not possible for a configuration because of spin conservation 
and has to be zero. 



G 


t t 

A(t t,G) 


t 1 

A(t *,G) 

V J / 


A(t *,G) 


graph weight 




1 


1 






0-© 
0-© 




























1 










total weight 


1 + 










times: 

IJ(1 - dr.J/2) = lim (1 - ArJ/2) M = exp(-/3J/2) (71) 

r=0 

Taking into account that the spin is S — 1/2 and the corresponding classical 
coupling J c i = S 2 J = J/4 we find for the probability that two spins are 
connected: 1 — exp(— 2(3 J d ). We end up exactly with the cluster algorithm 
for the classical Ising model! 

The other cases are special. Here each graph connects two spins. As each 
of these spins is again connected to only one other, all spins connected by a 
cluster form a closed loop, hence the name "loop algorithm" . Only one issue 
remains to be explained: how do we assign a horizontal or crossing graph with 
infinitesimal probability, such as (J/2)dr. This is easily done by comparing 
the assignment process with radioactive decay. For each segment the graph 
runs vertical, except for occasional decay processes occuring with probability 
(J/2)dr. Instead of asking at every infinitesimal time step whether a decay 
occurs we simply calculate an exponentially distributed decay time t using an 
exponential distribution with decay constant J/2. Looking up the equation 
in the lecture notes of the winter semester we have t = — (2/J)ln(l — u) 
where u is a uniformly distributed random number. 
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world lines 



world lines + 
decay graphs 



world lines 

after flips of some 

loop clusters 



Figure 3: Example of a loop update. In a first step decay paths are inserted 
where possible at positions drawn randomly according to an exponential 
distribution and graphs are assigned to all exchange terms (hoppings of world 
lines). In a second stage (not shown) the loop clusters are identified. Finally 
each loop cluster is flipped with probability 1/2 and one ends up with a new 
configuration. 

The algorithm now proceeds as follows (see Fig. 3): for each bond we 
start at time and calculate a decay time. If the spins at that time are 
oriented properly and an exchange graph is possible we insert one. Next 
we advance by another randomly chosen decay time along the same bond 
and repeat the procedure until we have reached the extent (3. This assigns 
graphs to all infinitesimal time steps where spins do not change. Next we 
assign a graph to all of the (finite number of) time steps where two spins are 
exchanged. In the case of the Heisenberg models there is always only one 
possible graph to assign and this is very easy. In the next step we identify 
the loop-clusters and then flip them each with probability 1/2. Alternatively 
a Wolff-like algorithm can be constructed that only builds one loop-cluster. 

Improved estimators for measurements can be constructed like in classical 
models. The derivation is similar to the classical models. I will just men- 
tion two simple ones for the ferromagnetic Heisenberg model. The spin-spin 
corelation is 



^(t)SJ(t') = 



1 if (i, t) und (J, r') are on the same cluster 
otherwise 



(72) 



and the uniform susceptibilty is 



X = ^gX>(c) 2 , 



(73) 
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where the sum goes over all loop clusters and S(c) is the length of all the 
loop segments in the loop cluster c. 

For further information on the loop algorithm I refer to the recent review 
by Evertz [7]. 



4.3 The negative sign problem 

Now that we have algorithms with no critical slowing down we could think 
that we have completely solved the problem of quantum many body prob- 
lems. 

There is however the negative sign problem which destroys our dreams. 
We need to interpret the matrix elements (i n \l — ArH\i n+1 ) as probablities, 
which requires them to be positive. However all off-diagonal positive matrix 
elements of H arising from Fermi statistics give rise to a negative probability 
in fermionic systems and in frustrated quantum magnets. 

The simplest example is the exchange term —{J xy /2)(Sf S~ + S~ Slf) in 
the Hamiltonian (66) in the case of an antiferromagnet with J xy < 0. For 
any bipartite lattice, such as chains, square lattices or cubic lattices with 
there is always an even number of such exchanges and we get rescued once 
more. For non-bipartite lattices (such as a triangular lattice), on which the 
antiferromagnetic order is frustrated there is no way around the sign problem. 
Similarly a minus sign occurs in all configurations where two fermions are 
exchanged. 

Even when there is a sign problem we can still do a simulation. Instead 
of sampling 

_ JA(x)p(x)dx 
{)p - Ip(x)dx (74j 
we rewrite this equation as [8] 

J A(x)sign(p(x))\p(x)\dx 

/ m = I \p(*)\d* = Q4-signp)| p | 

J sign(p(x))\p(x)\dx (signp)| p | 
J \p{x)\dx 

We sample with the absolute values \p\ and include the sign in the observ- 
able. The "sign problem" is the fact that the errors get blown up by an 
additional factor 1/ (signp) i p i , which grows exponentially with volume and 
inverse temperature f3, as (signp)| p | oc exp(— const x f3N). Then we are un- 
fortunately back to exponential scaling. Many people have tried to solve the 
sign problem using basis changes or clever reformulations, but - except for 
special cases - nobody has succeeded yet. In fact we could show that in some 
cases the sign problem is NP-hard and a solution thus all but impossible [9] . 
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If you want you can try your luck: the person who finds a general solution 
to the sign problem will surely get a Nobel prize! 

5 Exact diagonalization 

5.1 Creating the basis set and matrix 

The most accurate method for quantum lattice models is exact diagonaliza- 
tion of the Hamiltonian matrix using the Lanczos algorithm. The size of the 
Hilbert space of an A-site system [4 N for a Hubbard model , 3^ for a i-J 
model and (2S + 1)^ for a spin-5 1 model] can be reduced by making use of 
symmetries. Translational symmetries can be employed by using Bloch waves 
with fixed momentum as basis states. Conservation of particle number and 
spin allows to restrict a calculation to subspaces of fixed particle number and 
magnetization. 

As an example I will sketch how to implement exact diagonalization for a 
simple one-dimensional spinless fermion model with nearest neighbor hopping 
t and nearest neighbor repulsion V: 

L-1 L-1 

H = -tJ2 (4^+1 + H.c.) + rnni + i. (76) 

i=l i 

The first step is to construct a basis set. We describe a basis state as 
an unsigned integer where bit % set to one corresponds to an occupied site 
%. As the Hamiltonian conserves the total particle number we thus want to 
construct a basis of all states with A particles on L sites (or A bits set to 
one in L bits). The function state (i) returns the state corresponding to 
the i-th basis state, and the function index(s) returns the number of a basis 
state s. 

#include <vector> 
#include <alps/bitops .h> 

class FermionBasis { 
public : 

typedef unsigned int state_type; 
typedef unsigned int index_type; 
FermionBasis (int L, int N) ; 

state_type state (index_type i) const {return states. [i] ;} 
index_type index (state_type s) const {return index_[s];} 
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unsigned int dimensionO const { return states. . size (); } 
private : 

std: : vector<state_type> states. ; 
std: : vector<index_type> index. ; 

>; 

FermionBasis : :FermionBasis(int L, int N) 
{ 

index_ .resize (1«L) ; // 2~L entries 
for (state_type s=0;s<index_.size() ;++s) 
if (alps : :popcnt (s)==N) { 

// correct number of particles 

states. .push_back(s) ; 

index. [s] =states_ . size () -1 ; 

> 

else 

// invalid state 

index_ [s] =std: :numeric_limits<index_type>: :max() ; 

} 

Next we have to implement a matrix-vector multiplication v = Hw for 
the Hamiltonian: 

class HamiltonianMatrix : public FermionBasis { 
public : 

HamiltonianMultiplier (int L, int N, double t, double V) 
: FermionBasis (L , N) , t_(t), V_(V), L_(L) {} 

void multiply (std: : valarray<double>& v, const std: : valarray<double>& 

private : 

double t_, V_; 
int L_; 

} 

void HamiltonianMatrix: : multiply (std: : valarray<double>& v, 
const std: : valarray<double>& w) 

{ 

// do the V-term 
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for (int i=0; i<dimension() ;++i) 
{ 

state_type s = state (i); 

// count number of neighboring fermion pairs 
v[i]=w[i]*V_*alps: :popcnt(s&(s»l)) ; 

} 

// do the t-term 

for (int i=0; i<dimension() ;++i) 

{ 

state_type s = state(i); 

for (int r=0;r<L_-l;++r) { 

state_type shop = s~(3<<r); // exchange two particles 
index_type idx = index(shop); // get the index 
if (idx!=std: :numeric_limits<index_type> : :max()) 
v [idx] +=w [i] *t ; 

> 

} 

} 

This class can now be used together with the Lanczos algorithm to cal- 
culate the energies and wave functions of the low lying states of the Hamil- 
tonian. 

5.2 The Lanczos algorithm 

Sparse matrices with only O(N) non-zero elements are very common in scien- 
tific simulations. We have already encountered them in the winter semester 
when we discretized partial differential equations. Now we have reduced the 
transfer matrix of the Ising model to a sparse matrix product. We will later 
see that also the quantum mechanical Hamilton operators in lattice models 
are sparse. 

The importance of sparsity becomes obvious when considering the cost 
of matrix operations as listed in table 7. For large N the sparsity leads to 
memory and time savings of several orders of magnitude. 

Here we will discuss the iterative calculation of a few of the extreme 
eigenvalues of a matrix by the Lanczos algorithm. Similar methods can be 
used to solve sparse linear systems of equations. 

To motivate the Lanczos algorithms we will first take a look at the power 
method for a matrix A. Starting from a random initial vector u\ we calculate 
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Table 7: Time and memory complexity for operations on sparse and dense 
N x N matrices 



operation 


time 


memory 


storage 






dense matrix 


— 


7V TO 

N 2 


sparse matrix 


— 


O(N) 


matrix-vector multiplication 






dense matrix 


0(iV 2 ) 


0(N 2 ) 


sparse matrix 


0(N) 


O(N) 


matrix-matrix multiplication 






dense matrix 


( N s ' s ) 


0(N 2 ) 


sparse matrix 


0{N)...0(N 2 ) 


0(N)...0(N 2 ) 


all eigen values and vectors 






dense matrix 


0(iV 3 ) 


0(N 2 ) 


sparse matrix (iterative) 


0(N 2 ) 


0(N 2 ) 


some eigen values and vectors 






dense matrix (iterative) 


0(N 2 ) 


0(N 2 ) 


sparse matrix (iterative) 


0(N) 


O(N) 



the sequence 

Un+l = P^' (77) 
which converges to the eigenvector of the largest eigenvalue of the matrix A. 
The Lanczos algorithm optimizes this crude power method. 

5.2.1 Lanczos iterations 

The Lanczos algorithm builds a basis {i>i, v 2, . . . , vm} for the Krylov-subspace 
Km = span{ui,U2, ■ ■ ■ ,%}, which is constructed by M iterations of equa- 
tion (77). This is done by the following iterations: 

(3 n+1 v n+1 = Av n - a n v n - pnVn-i, (78) 

where 

a n = vlAv n , p n = \v^Av n -i\. (79) 
As the orthogonality condition 

vjvj = 8^ (80) 

does not determine the phases of the basis vectors, the $ can be chosen to 
be real and positive. As can be seen, we only need to keep three vectors of 
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size N in memory, which makes the Lanczos algorithm very efficient, when 
compared to dense matrix eigensolvers which require storage of order N 2 . 
In the Krylov basis the matrix A is tridiagonal 



y(«) •— 



Oil 02 

o ••. •• 











(81) 







j3 n ct n _ 



The eigenvalues {ri, . . . , tm} of T are good approximations of the eigen- 
values of A. The extreme eigenvalues converge very fast. Thus M <C N 
iterations are sufficient to obtain the extreme eigenvalues. 

5.2.2 Eigenvectors 

It is no problem to compute the eigenvectors of T. They are however given in 
the Krylov basis {vi, t> 2 , . . . , %}. To obtain the eigenvectors in the original 
basis we need to perform a basis transformation. 

Due to memory constraints we usually do not store all the Vi, but only 
the last three vectors. To transform the eigenvector to the original basis we 
have to do the Lanczos iterations a second time. Starting from the same 
initial vector vi we construct the vectors Vi iteratively and perform the basis 
transformation as we go along. 

5.2.3 Roundoff errors and ghosts 

In exact arithmetic the vectors {v^ are orthogonal and the Lanczos iterations 
stop after at most N — 1 steps. The eigenvalues of T are then the exact 
eigenvalues of A. 

Roundoff errors in finite precision cause a loss of orthogonality. There are 
two ways to deal with that: 

• Reorthogonalization of the vectors after every step. This requires stor- 
ing all of the vectors {v^ and is memory intensive. 

• Control of the effects of roundoff. 

We will discuss the second solution as it is faster and needs less memory. 
The main effect of roundoff errors is that the matrix T contains extra spurious 
eigenvalues, called "ghosts". These ghosts are not real eigenvalues of A. 
However they converge towards real eigenvalues of A over time and increase 
their multiplicities. 



31 



A simple criterion distinguishes ghosts from real eigenvalues. Ghosts are 
caused by roundoff errors. Thus they do not depend on on the starting vector 
v\. As a consequence these ghosts are also eigenvalues of the matrix T, which 
can be obtained from T by deleting the first row and column: 



«2 Ai 

A «3 

o ••. 

••• 







Pn 






I3n 
Qtr), 



(82) 



From these arguments we derive the following heuristic criterion to dis- 
tinguish ghosts from real eigenvalues: 

• All multiple eigenvalues are real, but their multiplicities might be too 
large. 

• All single eigenvalues of T which are not eigenvalues of T are also real. 

Numerically stable and efficient implementations of the Lanczos algo- 
rithm can be obtained from netlib. As usual, do not start coding your own 
algorithm but use existing optimal implementations. 
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